MIXRTs: Toward Interpretable Multi-Agent Reinforcement Learning via Mixing Recurrent Soft Decision Trees. (arXiv:2209.07225v2 [cs.LG] UPDATED)
Multi-agent reinforcement learning (MARL) recently has achieved tremendous
success in a wide range of fields. However, with a black-box neural network
architecture, existing MARL methods make decisions in an opaque fashion that
hinders humans from understanding the learned knowledge and how input
observations influence decisions. Our solution is MIXing Recurrent soft
decision Trees (MIXRTs), a novel interpretable architecture that can represent
explicit decision processes via the root-to-leaf path of decision trees. We
introduce a novel recurrent structure in soft decision trees to address partial
observability, and estimate joint action values via linearly mixing outputs of
recurrent trees based on local observations only. Theoretical analysis shows
that MIXRTs guarantees the structural constraint with additivity and
monotonicity in factorization. We evaluate MIXRTs on a range of challenging
StarCraft II tasks. Experimental results show that our interpretable learning
framework obtains competitive performance compared to widely investigated
baselines, and delivers more straightforward explanations and domain knowledge
of the decision processes.
( 2
min )